Deploying models to Azure Managed Endpoints using Azure Machine Learning

We follow this tutorial. Note that this is the most up-to-date method, using the Python SDK v2 (azure-ai-ml); tutorials based on azureml-core (e.g. the DP-100 course or this demo) are outdated. See this blog for more info.

Steps:
1. Set up the Azure Machine Learning workspace and an Azure Managed Online Endpoint
2. Deploy the model
3. Test the model
4. Delete the endpoint

We use the following model and workspace settings throughout:

model_path = "../models/model.pkl"
workspace = "azure-ml-deployment-methods"
resource_group = "azure-cognitive-services-accelerator"

1. Setup

1.1 Setup Azure Machine Learning

First create a new Azure Machine Learning workspace in the Azure portal.

subscription_id = !az account show --query id
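The ! magic returns the CLI output as a list of lines, and az prints the id as a JSON-quoted string, which is why the quotes are stripped with [1:-1] when the value is used below. A minimal sketch of that unwrapping (the GUID here is a placeholder):

```python
# Simulated output of `!az account show --query id`: one line, JSON-quoted.
raw_output = ['"00000000-0000-0000-0000-000000000000"']

# Strip the surrounding double quotes to get the bare subscription id.
subscription_id = raw_output[0][1:-1]
print(subscription_id)
```

Passing --output tsv to az would avoid the quotes entirely.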

Connect to Azure ML workspace

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

client = MLClient(
    DefaultAzureCredential(),
    subscription_id[0][1:-1],  # strip the surrounding quotes from the CLI output
    resource_group,
    workspace,
)

1.2 Setup Azure Managed Online Endpoint

from azure.ai.ml.entities import ManagedOnlineEndpoint

endpoint = ManagedOnlineEndpoint(
    name="sample-model-endpoint",
    description="A sample ML model deployment to Managed Endpoints",
    auth_mode="key",
    tags={'area': 'diabetes', 'type': 'regression'},
)
client.online_endpoints.begin_create_or_update(endpoint).result()

2. Deploy model to endpoint

from azure.ai.ml.entities import ManagedOnlineDeployment, Model, Environment, CodeConfiguration

Configure the deployment by selecting an environment and a model. Here we use a base Ubuntu image and add conda dependencies on top.

!mkdir files

%%writefile files/inference.py
import json, joblib, os
import numpy as np

def init():
    global model
    model = joblib.load(os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl'))

def run(raw_data):
    predictions = model.predict(np.array(json.loads(raw_data)['data'])).tolist()
    return json.dumps(predictions)
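Before deploying, the init/run contract can be smoke-tested locally. This sketch stands in a trivial fake model for the pickled one (the real script converts the payload with numpy; plain lists are used here to keep the example self-contained):

```python
import json

class FakeModel:
    # Mimics the predict() interface of the pickled scikit-learn model.
    def predict(self, rows):
        return [sum(row) for row in rows]

model = FakeModel()

def run(raw_data):
    # Same contract as inference.py: JSON in, JSON-encoded predictions out.
    predictions = model.predict(json.loads(raw_data)["data"])
    return json.dumps(predictions)

print(run(json.dumps({"data": [[1, 2], [3, 4]]})))  # → [3, 7]
```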

%%writefile files/conda.yml
name: model-env
channels:
  - defaults
dependencies:
  - python=3.7
  - pip
  - pip:
    - scikit-learn
    - joblib
    - azureml-defaults
    - inference-schema[numpy-support]
Reference the local model file; it is uploaded and registered as part of the deployment:

model = Model(path=model_path)

env = Environment(
    conda_file="files/conda.yml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

code_config = CodeConfiguration(code="files", scoring_script="inference.py")

Create deployment and connect to endpoint

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="sample-model-endpoint",
    model=model,
    environment=env,
    code_configuration=code_config,
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

Deploy. This creates a deployment that you can view in Azure Machine Learning Studio under Endpoints/<name>.

client.online_deployments.begin_create_or_update(deployment).result()
# Route all traffic to the new deployment
endpoint.traffic = {"blue": 100}
client.begin_create_or_update(endpoint).result()
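Traffic weights across the deployments on an endpoint must total 100, which is what enables gradual blue/green rollouts. A sketch of a hypothetical split, assuming a second deployment named "green" had been created on the same endpoint:

```python
# Hypothetical gradual rollout: most traffic stays on "blue" while "green"
# receives a small share for validation. The weights must sum to 100.
traffic = {"blue": 90, "green": 10}
assert sum(traffic.values()) == 100
print(traffic)
```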

3. Test endpoint

Grab the API key from Azure Machine Learning Studio under Endpoints/<name>/Consume.

uri = client.online_endpoints.get(name="sample-model-endpoint").scoring_uri
key = ""

import requests
import json
import numpy as np

input_payload = json.dumps({
    'data': np.load("../data/diabetes.npz")["X_test"][:2].tolist(),
})

response = requests.post(
    uri,
    data=input_payload,
    headers={
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + key,
    },
)
response.json()
'[230.86996560914233, 241.27351292981544]'
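Note that the response body is a JSON string rather than a list: because run() returns json.dumps(predictions), the predictions come back double-encoded, so one more json.loads recovers the list (the body below is copied from the output above):

```python
import json

# The endpoint's .json() already decoded once, yielding this string.
body = '[230.86996560914233, 241.27351292981544]'

# Decode again to get the actual list of predictions.
predictions = json.loads(body)
print(predictions[0])  # → 230.86996560914233
```

Returning the list directly from run(), without json.dumps, should avoid the double encoding.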

4. Delete

client.online_endpoints.begin_delete(name="sample-model-endpoint").result()
!rm -r files